fix: Fixing issue with first gen token being returned twice in streaming #3427

Merged
pcastonguay merged 9 commits into NVIDIA:main from pcastonguay:first_double_token_fix
Apr 14, 2025

Conversation

@pcastonguay
Collaborator

Better fix for first gen token being returned twice in streaming mode.

@pcastonguay pcastonguay requested a review from Shunkangz April 9, 2025 18:58
@pcastonguay pcastonguay force-pushed the first_double_token_fix branch from dfaebf0 to 384d556 Compare April 9, 2025 19:04
@pcastonguay
Collaborator Author

/bot run --disable-fail-fast

@pcastonguay pcastonguay requested a review from Tabrizian April 9, 2025 19:19
@tensorrt-cicd
Collaborator

PR_Github #1641 [ run ] triggered by Bot

Member

@Tabrizian Tabrizian left a comment

Thanks Patrice!

Comment thread tests/integration/test_lists/waives.txt Outdated
@Shunkangz
Collaborator

LGTM. Thank you!

@tensorrt-cicd
Collaborator

PR_Github #1641 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1228 completed with status: 'FAILURE'

@pcastonguay pcastonguay force-pushed the first_double_token_fix branch from df06fa2 to d90b2b3 Compare April 10, 2025 12:13
@pcastonguay
Collaborator Author

/bot run --add-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #1773 [ run ] triggered by Bot

…aming

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
@pcastonguay pcastonguay force-pushed the first_double_token_fix branch from d90b2b3 to 45ac435 Compare April 11, 2025 01:09
@pcastonguay
Collaborator Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1827 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1773 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1315 completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #1827 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1351 (Partly Tested) completed with status: 'FAILURE'

@Shunkangz
Collaborator

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1839 [ run ] triggered by Bot

@QiJune
Collaborator

QiJune commented Apr 11, 2025

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1839 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1361 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #1865 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1865 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1382 (Partly Tested) completed with status: 'FAILURE'

@QiJune
Collaborator

QiJune commented Apr 12, 2025

/bot run --only-multi-gpu-test

@pcastonguay
Collaborator Author

/bot run --add-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #2050 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2050 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1509 completed with status: 'FAILURE'

@pcastonguay
Collaborator Author

/bot run --stage-list "L40S-TensorRT-3"

@pcastonguay
Collaborator Author

/bot run --only-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #2059 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2060 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2059 [ run ] completed with state ABORTED

@pcastonguay
Collaborator Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #2061 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2060 [ run ] completed with state ABORTED

@tensorrt-cicd
Collaborator

PR_Github #2061 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1518 (Partly Tested) completed with status: 'SUCCESS'

@pcastonguay
Collaborator Author

/bot run --stage-list "L40S-TensorRT-3"

@tensorrt-cicd
Collaborator

PR_Github #2070 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2070 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1525 (Partly Tested) completed with status: 'SUCCESS'

@pcastonguay
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #2084 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2084 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #2070 (Partly Tested) for commit febb026

@pcastonguay
Collaborator Author

/bot skip --comment "ran all tests previously"

@tensorrt-cicd
Collaborator

PR_Github #2087 [ skip ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2087 [ skip ] completed with state SUCCESS
Skipping testing for commit 57726cd

@pcastonguay
Collaborator Author

/bot skip --comment "Ran all tests previously"

@tensorrt-cicd
Collaborator

PR_Github #2092 [ skip ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2092 [ skip ] completed with state SUCCESS
Skipping testing for commit b2c0059

@pcastonguay pcastonguay merged commit fe6f14b into NVIDIA:main Apr 14, 2025
3 checks passed
wu1du2 pushed a commit to wu1du2/TensorRT-LLM that referenced this pull request May 11, 2025
…ing (NVIDIA#3427)

* fix: Fixing issue with first gen token being returned twice with streaming

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Fixing not_expectring_strings in test

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

---------

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>

5 participants